Introduction

For our case study, the primary focus is to determine which compound is the best biomarker of recent cannabis use, and which matrix is optimal for measuring it. Each compound (THC, CBN, etc.) is measured in one of three matrices: blood, breath, or oral fluid. Within a time frame of three hours (180 minutes), we would like to determine which compound, in its respective matrix, is the most reliable indicator, so that we can extrapolate our findings to real-life applications. This study is important because accurately testing for marijuana use matters in many everyday scenarios. For example, THC in marijuana can affect an individual's motor skills, depth perception, and overall cognition, which can hinder their ability to work effectively and safely. According to the National Institute on Drug Abuse, employees who tested positive for marijuana had a 55% increase in workplace accidents and an 85% increase in work-related injuries 1. These liabilities can hurt the company and, more importantly, the individual; hence it is paramount for companies to run effective drug tests on their employees. Another scenario in which knowing the most effective compound and matrix matters is determining whether a driver is under the influence. Quoted directly from the National Highway Traffic Safety Administration: “In the 2013-2014 survey 2, 12.6 percent of weekend nighttime drivers tested positive for marijuana. That’s a 48-percent increase in less than 10 years” 3.

All of this information is quite alarming, which is why we want to determine which compound and matrix are the most effective to analyze when deciding whether an individual is under the influence. It is also worth mentioning that we would like to go deeper with this study by investigating whether some compounds are more sensitive to higher or lower doses of marijuana. The relationship between these variables can tell us whether some compounds are worth paying more attention to than others, which ultimately saves the tester time when running a drug test.

Load packages

library(tidymodels)
library(tidyverse)
library(dplyr)
library(ggplot2)
library(janitor)
library(purrr)
library(rstatix)
library(cowplot)

Question

  • Which compound, in which matrix, and at what cutoff is the best biomarker of recent use? (Recent use is defined as within 3 hours.)
  • Do some compounds respond more to a high dose vs. a low dose than others?

The Data

Data Import

WB = read.csv("data/Blood.csv")
BR = read.csv("data/Breath.csv")
OF = read.csv("data/OF.csv")

Data Wrangling

We re-coded and re-leveled the Treatment and Group variables, cleaned the column names with the janitor package, and renamed specific columns (x11_oh_thc to thcoh, thc_v to thcv, thc_cooh_gluc to thccooh_gluc, and thc_cooh to thccooh) in all three tables.

Suggestion: add a new column recording whether someone should be classified as a recent THC user (THC group, within 3 hours of smoking) or not (anyone in the placebo group, or the THC group outside the 3-hour window).
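A minimal sketch of that suggested column, assuming the cleaned lowercase names (treatment, time_from_start) produced by the wrangling below; the recent_use column and the add_recent_use helper are our own names, not part of the original pipeline:

```r
# Hypothetical helper: flag a sample as "recent use" if the subject received
# THC (not placebo) and the sample was taken within 180 min of smoking.
add_recent_use <- function(df) {
  df |>
    dplyr::mutate(recent_use = treatment != "Placebo" &
                    time_from_start > 0 &
                    time_from_start <= 180)
}

# e.g. WB <- add_recent_use(WB)
```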

OF <- OF |>
  mutate(Treatment = fct_recode(Treatment, 
                                "5.9% THC (low dose)" = "5.90%",
                                "13.4% THC (high dose)" = "13.40%"),
         Treatment = fct_relevel(Treatment, "Placebo", "5.9% THC (low dose)"),
         Group = fct_recode(Group, 
                            "Occasional user" = "Not experienced user",
                            "Frequent user" = "Experienced user" )) |>  
  janitor::clean_names() |>
  rename(thcoh = x11_oh_thc,
         thcv = thc_v)

WB <- WB |> 
  mutate(Treatment = fct_recode(Treatment, 
                                "5.9% THC (low dose)" = "5.90%",
                                "13.4% THC (high dose)" = "13.40%"),
         Treatment = fct_relevel(Treatment, "Placebo", "5.9% THC (low dose)")) |> 
  janitor::clean_names() |>
  rename(fluid = fluid_type,
         thcoh = x11_oh_thc,
         thccooh = thc_cooh,
         thccooh_gluc = thc_cooh_gluc,
         thcv = thc_v)

BR <- BR |> 
  mutate(Treatment = fct_recode(Treatment, 
                                "5.9% THC (low dose)" = "5.90%",
                                "13.4% THC (high dose)" = "13.40%"),
         Treatment = fct_relevel(Treatment, "Placebo", "5.9% THC (low dose)"),
         Group = fct_recode(Group, 
                            "Occasional user" = "Not experienced user",
                            "Frequent user" = "Experienced user" )) |> 
  janitor::clean_names() |> 
  rename(thc = thc_pg_pad)


# keep only the compound columns (by position) that are not entirely NA
compounds_WB <- as.list(colnames(Filter(function(x) !all(is.na(x)), WB[6:13])))
compounds_BR <- as.list(colnames(Filter(function(x) !all(is.na(x)), BR[6])))
compounds_OF <- as.list(colnames(Filter(function(x) !all(is.na(x)), OF[6:12])))

We created three reference tables of time windows (in minutes), labeled accordingly, covering the pre-smoking period and the subsequent post-smoking periods for the blood, breath, and oral fluid data.

timepoints_WB <- tibble(
  start = c(-400, 0, 30, 70, 100, 180, 210, 240, 270, 300),
  stop = c(
    0,
    30,
    70,
    100,
    180,
    210,
    240,
    270,
    300,
    max(WB$time_from_start, na.rm = TRUE)
  ),
  timepoint = c(
    "pre-smoking",
    "0-30 min",
    "31-70 min",
    "71-100 min",
    "101-180 min",
    "181-210 min",
    "211-240 min",
    "241-270 min",
    "271-300 min",
    "301+ min"
  )
)

timepoints_BR <- tibble(
  start = c(-400, 0, 40, 90, 180, 210, 240, 270),
  stop = c(
    0,
    40,
    90,
    180,
    210,
    240,
    270,
    max(BR$time_from_start, na.rm = TRUE)
  ),
  timepoint = c(
    "pre-smoking",
    "0-40 min",
    "41-90 min",
    "91-180 min",
    "181-210 min",
    "211-240 min",
    "241-270 min",
    "271+ min"
  )
)

timepoints_OF <- tibble(
  start = c(-400, 0, 30, 90, 180, 210, 240, 270),
  stop = c(0, 30, 90, 180, 210, 240, 270,
           max(OF$time_from_start, na.rm = TRUE)),
  timepoint = c(
    "pre-smoking",
    "0-30 min",
    "31-90 min",
    "91-180 min",
    "181-210 min",
    "211-240 min",
    "241-270 min",
    "271+ min"
  )
)

assign_timepoint <- function(x, timepoints) {
  if (!is.na(x)) {
    # return the label of the window containing x, i.e. start < x <= stop
    timepoints$timepoint[x > timepoints$start & x <= timepoints$stop]
  } else {
    NA
  }
}
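For example, using the blood windows defined above, a sample drawn 45 minutes after the start falls in the (30, 70] window:

```r
assign_timepoint(45, timepoints_WB)
# "31-70 min"  (since 45 > 30 and 45 <= 70)

assign_timepoint(NA, timepoints_WB)
# NA
```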

We created a new column, timepoint_use, in each table by mapping the time_from_start values to the timepoints defined in the reference data frames (timepoints_WB, timepoints_OF, timepoints_BR). Finally, we releveled the timepoint_use factor to match the order specified in those reference data frames. This ensures consistent and meaningful timepoint labels for subsequent analyses and visualizations.

WB <- WB |> 
  mutate(timepoint_use = map_chr(time_from_start, 
                                 assign_timepoint, 
                                 timepoints=timepoints_WB),
         timepoint_use = fct_relevel(timepoint_use, timepoints_WB$timepoint))

OF <- OF |> 
  mutate(timepoint_use = map_chr(time_from_start, 
                                 assign_timepoint, 
                                 timepoints=timepoints_OF),
         timepoint_use = fct_relevel(timepoint_use, timepoints_OF$timepoint))

BR <- BR |> 
  mutate(timepoint_use = map_chr(time_from_start, 
                                 assign_timepoint, 
                                 timepoints=timepoints_BR),
         timepoint_use = fct_relevel(timepoint_use, timepoints_BR$timepoint))

Remove duplicate IDs

WB <- drop_dups(WB)
OF <- drop_dups(OF)
BR <- drop_dups(BR)

Exploratory Data Analysis

Compound measurements over time by treatment

The following plots show all compounds against time, colored according to their respective groups. To achieve a comprehensive understanding, we generated scatterplots for compounds across three distinct matrices: whole blood, oral fluid, and breath. This analysis encompasses various timepoints and considers the different treatments, namely placebo, low dose, and high dose.

Upon close examination of the scatterplots, a noteworthy observation emerges, particularly concerning the THC biomarker in whole blood. This specific biomarker appears to offer a potentially enhanced indication of recent cannabis joint usage. The scatterplot reveals a discernible separation between the placebo and THC treatment groups, suggesting that the THC measurement in whole blood may serve as a more reliable indicator of recent cannabis joint consumption.

scatter_WB <- map(compounds_WB, ~ compound_scatterplot_group_by_treatment( 
    dataset=WB, 
    compound=.x, 
    timepoints=timepoints_WB))

scatter_OF <- map(compounds_OF, ~ compound_scatterplot_group_by_treatment( 
    dataset=OF, 
    compound=.x, 
    timepoints=timepoints_OF))

scatter_BR <- map(compounds_BR, ~ compound_scatterplot_group_by_treatment( 
    dataset=BR, 
    compound=.x, 
    timepoints=timepoints_BR))

In this set of scatterplots, all compounds are again plotted against time, with color denoting treatment condition; the only change from the previous scatterplots is that a log transformation has been applied to the y-axis (the compound measurements), providing an alternative perspective.

Upon closer examination, a notable observation emerges: THC measured from breath shows a more discernible separation between the placebo and THC treatment groups in the log-transformed scatterplots. By compressing the scale, the log transformation reveals distinctions that are not as apparent on a linear scale. This underscores the importance of considering transformation choices when analyzing compound data over time, as the enhanced separation could provide valuable insight into treatment effects on THC levels.

scatter_WB_by_treatment <- map(compounds_WB, ~ compound_scatterplot_group_by_treatment_log( 
    dataset=WB, 
    compound=.x, 
    timepoints=timepoints_WB))

scatter_OF_by_treatment <- map(compounds_OF, ~ compound_scatterplot_group_by_treatment_log( 
    dataset=OF, 
    compound=.x, 
    timepoints=timepoints_OF))

scatter_BR_by_treatment <- map(compounds_BR, ~ compound_scatterplot_group_by_treatment_log( 
    dataset=BR, 
    compound=.x, 
    timepoints=timepoints_BR))

Here we delete compounds that obviously do not work from the compound lists. WB: cbd, thccooh, thccooh_gluc, thcv; OF: thcoh.

compounds_WB = compounds_WB[- c(2, 5, 6, 8)]  # drop cbd, thccooh, thccooh_gluc, thcv
compounds_OF = compounds_OF[- c(4)]           # drop thcoh

Calculating sensitivity and specificity.

output_WB <- map_dfr(compounds_WB,
                     ~ sens_spec_cpd(
                       dataset = WB,
                       cpd = all_of(.x),
                       timepoints =  timepoints_WB
                     )) |> clean_gluc()

output_BR <- map_dfr(compounds_BR, 
                     ~ sens_spec_cpd(
                       dataset = BR,
                       cpd = all_of(.x),
                       timepoints = timepoints_BR
                     ))  |> clean_gluc()

output_OF <- map_dfr(compounds_OF,
                     ~ sens_spec_cpd(
                       dataset = OF,
                       cpd = all_of(.x),
                       timepoints = timepoints_OF
                     ))  |> clean_gluc()
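As a reminder of the definitions behind the Sensitivity and Specificity columns in this output (assuming the standard definitions): sensitivity is the fraction of true positives detected, TP/(TP+FN), and specificity is the fraction of true negatives correctly cleared, TN/(TN+FP). A quick sanity check with illustrative counts:

```r
sensitivity <- function(TP, FN) TP / (TP + FN)
specificity <- function(TN, FP) TN / (TN + FP)

# Illustrative counts for a single time window and cutoff:
sensitivity(TP = 129, FN = 0)  # 1: every true positive is caught
specificity(TN = 24, FP = 39)  # 24/63, about 0.38: many negatives test positive
```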

cutoff vs. sensitivity/specificity

Here we plot the value of the cutoff against sensitivity and specificity for every compound in every matrix, and arrange them all into one big plot. These plots of sensitivity and specificity against cutoff values suggest an exploration of optimal cutoff points, which is crucial for answering question 1.

#plot detection limit(cutoff) vs. sensitivity and specificity
ss_WB <-
  ss_plot(output_WB, tpts = length(unique(output_WB$time_start)), tissue = "Blood")

ss_OF <-
  ss_plot(output_OF, tpts = length(unique(output_OF$time_start)), tissue = "Oral Fluid")

ss_BR <-
  ss_plot(output_BR, tpts = length(unique(output_BR$time_start)), tissue = "Breath") # there's something very wrong with this one

#arranges ss plots into one
ss_bottom_row <-
  plot_grid(
    ss_OF,
    ss_BR,
    labels = c('B', 'C'),
    label_size = 12,
    ncol = 2,
    rel_widths = c(0.66, .33)
  )
plot_grid(
  ss_WB,
  ss_bottom_row,
  labels = c('A', ''),
  label_size = 12,
  ncol = 1
)

Average sensitivity and specificity vs. detection limit

output_WB_avg = average_sens_spec(output = output_WB)
output_OF_avg = average_sens_spec(output = output_OF)
output_BR_avg = average_sens_spec(output = output_BR)


ss_WB_avg_together <-
  ss_plot_avg_together(output_WB_avg, tpts = length(unique(output_WB$time_start)), tissue = "Blood")

ss_OF_avg_together <-
  ss_plot_avg_together(output_OF_avg, tpts = length(unique(output_OF$time_start)), tissue = "Oral Fluid")

ss_BR_avg_together <-
  ss_plot_avg_together(output_BR_avg, tpts = length(unique(output_BR$time_start)), tissue = "Breath")

We now remove every compound where the average sensitivity and specificity curves do not intersect. Reasoning: for compounds with no intersection, the optimal sensitivity (the leftmost point of the graph) coincides with the worst specificity. There is no room for adjustment, because any movement of the cutoff from there would only make things worse.

compounds_WB = c("thc")  # only blood THC survives
compounds_OF = c("thc")  # only oral fluid THC survives
compounds_BR = NULL      # no breath compound has an intersection
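The no-intersection screen above can also be checked programmatically rather than by eye. A sketch against the averaged output (the curves_cross name is our own): two curves cross iff the sign of their difference changes somewhere along the detection-limit grid.

```r
# TRUE if the average sensitivity and specificity curves intersect
# anywhere along the detection-limit grid.
curves_cross <- function(avg) {
  d <- avg$average_sensitivity - avg$average_specificity
  any(diff(sign(d[!is.na(d)])) != 0)
}

# e.g. curves_cross(output_WB_avg)
```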

sensitivity vs. specificity

Here we plot sensitivity vs. specificity for the remaining compounds (blood THC and oral fluid THC).

output_WB <- map_dfr(compounds_WB,
                     ~ sens_spec_cpd(
                       dataset = WB,
                       cpd = all_of(.x),
                       timepoints =  timepoints_WB
                     )) |> clean_gluc()


output_OF <- map_dfr(compounds_OF,
                     ~ sens_spec_cpd(
                       dataset = OF,
                       cpd = all_of(.x),
                       timepoints = timepoints_OF
                     ))  |> clean_gluc()

#plot sensitivity vs. specificity
roc_WB = roc_plot(output_WB, tpts = length(unique(output_WB$time_start)), tissue = "Blood")

roc_OF = roc_plot(output_OF, tpts = length(unique(output_OF$time_start)), tissue = "Oral Fluid")


It should be apparent that OF-THC is the superior choice. Now we dig deeper into OF-THC to find the specific cutoff. Referring back to the average sensitivity and specificity vs. detection limit plot, we see that both sensitivity and specificity are high when the detection limit is very close to 0. Let's try out some more cutoffs close to 0.

Plot sensitivity and specificity over time given specific cutoffs

#pass specific cutoff into splits parameter
OF_THC <- sens_spec_cpd(
  dataset = OF,
  cpd = 'thc',
  timepoints = timepoints_OF,
  splits =  c(0.5, 1, 2, 5, 10)
) |> clean_gluc()

of_levels <- c("pre-smoking\nN=192", "0-30\nmin\nN=192", "31-90\nmin\nN=117",
               "91-180\nmin\nN=99", "181-210\nmin\nN=102", "211-240\nmin\nN=83",
               "241-270\nmin\nN=90",  "271+\nmin\nN=76")

plot_cutoffs(dataset=OF_THC, 
             timepoint_use_variable=OF$timepoint_use, 
             tissue="Oral Fluid", 
             cpd="THC", 
             x_labels=NULL)
## [[1]]

## 
## [[2]]
## # A tibble: 40 × 18
##       TP    FN    FP    TN detection_limit compound time_start time_stop
##    <dbl> <dbl> <int> <int> <fct>           <chr>         <dbl>     <dbl>
##  1     0     0    35   157 0.5             THC            -400         0
##  2     0     0    20   172 1               THC            -400         0
##  3     0     0     9   183 2               THC            -400         0
##  4     0     0     0   192 5               THC            -400         0
##  5     0     0     0   192 10              THC            -400         0
##  6   129     0    39    24 0.5             THC               0        30
##  7   129     0    30    33 1               THC               0        30
##  8   128     1    19    44 2               THC               0        30
##  9   128     1     3    60 5               THC               0        30
## 10   125     4     1    62 10              THC               0        30
## # ℹ 30 more rows
## # ℹ 10 more variables: time_window <fct>, NAs <int>, N <int>, N_removed <int>,
## #   Sensitivity <dbl>, Specificity <dbl>, PPV <dbl>, NPV <dbl>,
## #   Efficiency <dbl>, my_label <fct>

The average sensitivity is a lot more sensitive (ha) to change than the average specificity: specificity only dips in the 31-90 min window when the cutoff is lowered, whereas a lower cutoff increases sensitivity across the board, no matter the time. Additionally, this 31-90 min window, where specificity is heavily affected by a low cutoff, is arguably trivial: it should be quite apparent that someone is high if they smoked within the last 90 minutes. Lowered specificity in this time frame isn't cause for much concern, considering how much sensitivity is gained by using a low cutoff.

In a nutshell: a low cutoff is optimal, approximately somewhere between 0 and 2. Let's test more cutoffs in this range:

OF_THC <- sens_spec_cpd(
  dataset = OF,
  cpd = 'thc',
  timepoints = timepoints_OF,
  splits =  c(0.1, 0.25, 0.5, 1, 1.5)
) |> clean_gluc()

of_levels <- c("pre-smoking\nN=192", "0-30\nmin\nN=192", "31-90\nmin\nN=117",
               "91-180\nmin\nN=99", "181-210\nmin\nN=102", "211-240\nmin\nN=83",
               "241-270\nmin\nN=90",  "271+\nmin\nN=76")

plot_cutoffs(dataset=OF_THC, 
             timepoint_use_variable=OF$timepoint_use, 
             tissue="Oral Fluid", 
             cpd="THC", 
             x_labels=NULL)
## [[1]]

## 
## [[2]]
## # A tibble: 40 × 18
##       TP    FN    FP    TN detection_limit compound time_start time_stop
##    <dbl> <dbl> <int> <int> <fct>           <chr>         <dbl>     <dbl>
##  1     0     0    37   155 0.1             THC            -400         0
##  2     0     0    37   155 0.25            THC            -400         0
##  3     0     0    35   157 0.5             THC            -400         0
##  4     0     0    20   172 1               THC            -400         0
##  5     0     0    12   180 1.5             THC            -400         0
##  6   129     0    44    19 0.1             THC               0        30
##  7   129     0    44    19 0.25            THC               0        30
##  8   129     0    39    24 0.5             THC               0        30
##  9   129     0    30    33 1               THC               0        30
## 10   128     1    23    40 1.5             THC               0        30
## # ℹ 30 more rows
## # ℹ 10 more variables: time_window <fct>, NAs <int>, N <int>, N_removed <int>,
## #   Sensitivity <dbl>, Specificity <dbl>, PPV <dbl>, NPV <dbl>,
## #   Efficiency <dbl>, my_label <fct>

These all look pretty promising, but we need a way to quantify this. Next we calculate the sensitivity and specificity for cutoff values between 0 and 2.

output_OF = sens_spec_cpd_OFTHC(
                       dataset = OF,
                       cpd = "thc",
                       timepoints = timepoints_OF
                     )  |> clean_gluc()

output_OF_avg = average_sens_spec(output = output_OF)

output_OF_avg
## # A tibble: 101 × 4
##    compound detection_limit average_sensitivity average_specificity
##    <chr>              <dbl>               <dbl>               <dbl>
##  1 THC                 0                  0.956               0    
##  2 THC                 0.02               0.956               0.817
##  3 THC                 0.04               0.956               0.817
##  4 THC                 0.06               0.956               0.817
##  5 THC                 0.08               0.956               0.817
##  6 THC                 0.1                0.956               0.817
##  7 THC                 0.12               0.956               0.817
##  8 THC                 0.14               0.956               0.817
##  9 THC                 0.16               0.956               0.817
## 10 THC                 0.18               0.956               0.817
## # ℹ 91 more rows

Let's plot this quickly.

ss_OF_avg_together <-
  ss_plot_avg_together(output_OF_avg, tpts = length(unique(output_OF$time_start)), tissue = "Oral Fluid")

We found it: the curves intersect. Let's get the specific cutoff, i.e. where the difference between average sensitivity and average specificity is minimized.

output_OF_avg |>
  filter(abs(average_sensitivity-average_specificity) < 0.01) |>
  mutate(diff = abs(average_sensitivity-average_specificity))
## # A tibble: 5 × 5
##   compound detection_limit average_sensitivity average_specificity    diff
##   <chr>              <dbl>               <dbl>               <dbl>   <dbl>
## 1 THC                 0.82               0.893               0.890 0.00338
## 2 THC                 0.84               0.893               0.890 0.00338
## 3 THC                 0.86               0.893               0.890 0.00338
## 4 THC                 0.88               0.893               0.890 0.00338
## 5 THC                 0.9                0.893               0.890 0.00338

At cutoffs 0.82-0.90, the difference between the average sensitivity and average specificity is minimized. We pick the midpoint, 0.85.
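As a cross-check on that choice, one could also rank cutoffs by combined sensitivity + specificity (a Youden-style criterion). A sketch; note the intersection point need not coincide exactly with this maximum, so it is worth confirming the peak lands in the same 0.82-0.90 range:

```r
output_OF_avg |>
  mutate(total = average_sensitivity + average_specificity) |>
  slice_max(total, n = 3)
```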

Conclusion

The optimal biomarker is THC in oral fluid, at cutoff = 0.85.

Extended question

Wrangling

WB_long = WB |>
  pivot_longer(6:13, names_to = "compound")

OF_long = OF |>
  pivot_longer(6:12, names_to = "compound")
  
# names_to must match WB_long/OF_long so bind_rows aligns the compound column
BR_long <- BR |> pivot_longer(6, names_to = "compound")

df_full <- bind_rows(WB_long, OF_long, BR_long)

Part 1: the “placebo groups might have different compound measurements” hypothesis has been disproven :(

Here we test an extended question our group proposed: does taking a placebo have an effect on any of the compounds?

Theoretically, no, it shouldn't. But the graph showing subjective highness to be essentially random piqued my interest, and I thought there might be something worth looking into. The data, however, disproves that theory :(

WB_long |>
  filter(treatment == "Placebo") |>
  filter(timepoint_use == "pre-smoking" |
           timepoint_use == "0-30 min") |>
  ggplot(mapping = aes(x = log(value))) +
  geom_histogram(binwidth = 0.5) +
  facet_grid(vars(timepoint_use), vars(compound), scales = "free")

Part 2: BUT I have a different extended question we can look into

This one is SUPER interesting. Some of the pair plots we went over in lecture notes 12 looked like they had two separate lines, so I thought one of the variables might change how compounds correlate with each other, and they do! Namely, the low-dose and high-dose groups seem to have different slopes for the correlation between certain compounds.

Here’s a few of the more obvious ones:

ggplot(data = WB,
       aes(y = cbg, x = thc, color = treatment)) +
  geom_point(alpha = 0.5) +
  geom_smooth(
    method = "lm",
    # Use linear regression
    formula = y ~ x,
    # Specify the formula for the linear model
    se = FALSE,
    # Don't show the confidence interval
    data = WB |> filter(treatment == "5.9% THC (low dose)"),
    # Filter for low dose
    color = "green"
  ) +
  geom_smooth(
    method = "lm",
    formula = y ~ x,
    se = FALSE,
    data = WB |> filter(treatment == "13.4% THC (high dose)"),
    # Filter for high dose
    color = "blue"
  ) +
  facet_wrap(~ group, scales = "free")

ggplot(data = OF,
       aes(y = thc, x = cbg, color = treatment)) +
  geom_point(alpha = 0.5) +
  geom_smooth(
    method = "lm",
    formula = y ~ x,
    se = FALSE,
    data = OF |> filter(treatment == "5.9% THC (low dose)"),
    color = "green"
  ) +
  geom_smooth(
    method = "lm",
    formula = y ~ x,
    se = FALSE,
    data = OF |> filter(treatment == "13.4% THC (high dose)"),
    color = "blue"
  ) +
  facet_wrap( ~ group, scales = "free")

ggplot(data = OF,
       aes(y = cbn, x = cbg, color = treatment)) +
  geom_point(alpha = 0.5) +
  geom_smooth(
    method = "lm",
    formula = y ~ x,
    se = FALSE,
    data = OF |> filter(treatment == "5.9% THC (low dose)"),
    color = "green"
  ) +
  geom_smooth(
    method = "lm",
    formula = y ~ x,
    se = FALSE,
    data = OF |> filter(treatment == "13.4% THC (high dose)"),
    color = "blue"
  ) +
  facet_wrap( ~ group, scales = "free")

ggplot(data = OF,
       aes(y = cbg, x = thcv, color = treatment)) +
  geom_point(alpha = 0.5) +
  geom_smooth(
    method = "lm",
    formula = y ~ x,
    se = FALSE,
    data = OF |> filter(treatment == "5.9% THC (low dose)"),
    color = "green"
  ) +
  geom_smooth(
    method = "lm",
    formula = y ~ x,
    se = FALSE,
    data = OF |> filter(treatment == "13.4% THC (high dose)"),
    color = "blue"
  ) +
  facet_wrap( ~ group, scales = "free")

This seems to show that even though the dose (low vs. high) doesn't affect overall compound concentrations (as shown in the plots in the lecture notes), it does change the correlations between compounds. A potential extended question would be: “Do some compounds respond more to a high dose vs. a low dose than others?”
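One hedged way to quantify those slope differences for that follow-up question: fit a linear model with a dose-by-concentration interaction and inspect the interaction term (a sketch using the blood data and the cbg-vs-thc pair plotted above; a significant thc:treatment interaction would mean the slopes differ by dose):

```r
# Restrict to the two THC doses, drop the unused Placebo level, and test
# whether the cbg ~ thc slope differs between doses via the interaction term.
dose_only <- WB |>
  filter(treatment != "Placebo") |>
  mutate(treatment = droplevels(treatment))

dose_slopes <- lm(cbg ~ thc * treatment, data = dose_only)
summary(dose_slopes)$coefficients
```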



    1. Marijuana at Work: What Employers Need to Know. NSC. https://www.nsc.org/nsc-membership/marijuana-at-work#:~:text=According%20to%20a%20study%20reported,Decreased%20productivity↩︎

    2. Research Note: Results of the 2013-2014 National Roadside Survey of Alcohol and Drug Use by Drivers. NHTSA. https://www.nhtsa.gov/sites/nhtsa.gov/files/812118-roadside_survey_2014.pdf↩︎

    3. Drug-Impaired Driving. NHTSA. https://www.nhtsa.gov/risky-driving/drug-impaired-driving#:~:text=In%202007%2C%20NHTSA’s%20National%20Roadside,in%20less%20than%2010%20years.↩︎